NSF PAR Search | NSF Public Access Repository

Automated Assessment of Critical View of Safety in Laparoscopic Cholecystectomy

https://doi.org/10.1109/ICHI57859.2023.00051

Li, Yunfan; Gupta, Himanshu; Ling, Haibin; Ramakrishnan, IV; Prasanna, Prateek; Georgakis, Georgios; Sasson, Aaron (June 2023, IEEE)

Full Text Available
Automated Assessment of Critical View of Safety in Laparoscopic Cholecystectomy

Li, Yunfan; Gupta, Himanshu; Ling, Haibin; Ramakrishnan, IV; Georgakis, Georgios; Sasson, Aaron; Prasanna, Prateek (January 2023, Critical View of Safety in Laparoscopic Cholecystectomy.)

Full Text Available
Uncertainty-driven Planner for Exploration and Navigation

https://doi.org/10.1109/ICRA46639.2022.9812423

Georgakis, Georgios; Bucher, Bernadette; Arapin, Anton; Schmeckpeper, Karl; Matni, Nikolai; Daniilidis, Kostas (May 2022, 2022 International Conference on Robotics and Automation (ICRA))

Full Text Available
Cross-modal Map Learning for Vision and Language Navigation

https://doi.org/10.1109/CVPR52688.2022.01502

Georgakis, Georgios; Schmeckpeper, Karl; Wanchoo, Karan; Dan, Soham; Miltsakaki, Eleni; Roth, Dan; Daniilidis, Kostas (June 2022, IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR))

We consider the problem of Vision-and-Language Navigation (VLN). The majority of current methods for VLN are trained end-to-end using either unstructured memory such as LSTM, or using cross-modal attention over the egocentric observations of the agent. In contrast to other works, our key insight is that the association between language and vision is stronger when it occurs in explicit spatial representations. In this work, we propose a cross-modal map learning model for vision-and-language navigation that first learns to predict the top-down semantics on an egocentric map for both observed and unobserved regions, and then predicts a path towards the goal as a set of way-points. In both cases, the prediction is informed by the language through cross-modal attention mechanisms. We experimentally test the basic hypothesis that language-driven navigation can be solved given a map, and then show competitive results on the full VLN-CE benchmark.
Full Text Available
Learning to Map for Active Semantic Goal Navigation

Georgakis, Georgios; Bucher, Bernadette; Schmeckpeper, Karl; Singh, Siddharth; Daniilidis, Kostas (January 2022, The Tenth International Conference on Learning Representations (ICLR 2022))

Full Text Available
Object-centric Video Prediction without Annotation

https://doi.org/10.1109/ICRA48506.2021.9561541

Schmeckpeper, Karl; Georgakis, Georgios; Daniilidis, Kostas (May 2021, 2021 IEEE International Conference on Robotics and Automation (ICRA))

Full Text Available
Cross-Modal Map Learning for Vision and Language Navigation

Georgakis, Georgios; Schmeckpeper, Karl; Wanchoo, Karan; Dan, Soham; Miltsakaki, Eleni; Roth, Dan; Daniilidis, Kostas (January 2022, Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR))

Full Text Available
